home *** CD-ROM | disk | FTP | other *** search
- From: pcolmer@acorn.co.uk (Philip Colmer)
- Date: 26 Oct 90 12:55:54 GMT
-
- Time to put this one to rest, methinks ...
-
- When you format a hard drive, be it an ADFS drive or a SCSI drive, you can
- choose between "Old map" and "New map" formats. To see which format your
- hard drive is in, type
-
- *ADFS:Map :4
-
- (or :5, 6 or 7)
-
- If the output says "new map", you've got a new map format.
-
- Some facts about the new map format:
-
- * this format has the ability to store smallish files within the space taken
- on the disc by the directory. The maximum size of these files is the value
- given to the LARGE FILE ALLOCATION UNIT, or LFAU.
-
- * the LFAU is specified when you format the hard drive. Apart from
- specifying the maximum size of a file which can be fitted within a
- directory space (NB *NOT* the maximum size of a file which can be stored
- on the disc), it is also related to the CAPACITY of the drive.
-
- The larger the drive, the higher the LFAU needs to be in order for the
- drive to be able to be formatted. The following is a table of disc sizes
- versus possible LFAUs:
-
- Capacity LFAU "Size" of a directory (approx)
- ======== ==== ==============================
- Up to 124Mbytes 256 4K
- Up to 249Mbytes 512 8K
- Up to 499Mbytes 1024 16K
- Up to 999Mbytes 2048 32K
-
- The "up to" values are inclusive. You can, of course, specify a large
- LFAU, but the side effect is that the larger the LFAU, the more disc space
- a directory takes, hence the "wasted disc space".
-
- This arrangement, therefore, allows files up to the LFAU to be stored on
- the disc without taking up any more disc space, and without fragmenting
- the disc too much.
-
- * when saving files to a new format disc, FileCore will automatically try
- and coalesce blocks to avoid fragmentation.
-
- * to determine what value LFAU your drive has:
-
- DIM block% 64
- SYS"ADFS_DescribeDisc",":4",block%
- P."LFAU is ";2^(block%?5)
-
- Some facts about the old map format:
-
- * each directory always takes 4Kbytes
-
- * the disc is not automatically compacted.
-
- Note that ALL of the above applies to any FileCore based filing system. This
- includes ADFS and SCSIFS.
-
- Hope this clears everything up.
-
- --Fil.
-
-
-
- From: nick@abccam.abcl.co.uk (Nick Reeves)
- Summary: E FORMAT DESIGN DOCUMENT
- Date: 26 Oct 90 16:53:50 GMT
- Organization: Active Book Company Limited, Cambridge, UK
-
-
- Briefly:
- The large directory size is the price you pay for a disc structure
- which almost always keeps files contiguously, but can use fragments if
- it has to. The rest of the space is not always wasted as small files
- in the directory can use it, and only have their length rounded up to
- the sector size rather than the cluster size. They are also quick to
- access as they are close to the directory and don't need to update the
- free space map.
-
-
- NEW DISC FILE STRUCTURE FOR RISC OS
- ===================================
-
- MOTIVATION
- ----------
- The aim of the new file structure is firstly to remove the following
- restrictions of the original file structure.
- 1. files must be stored contiguously on disc giving "compaction required"
- and "can't extend" errors
- 2. there was a maximum number of entries in the free space map giving
- "map full" error
- 3. defective sectors could not be mapped out of floppy discs
-
- However most file structures which overcome these (eg MS-DOS and UNIX) pay a
- heavy penalty in performance for the following reasons. As files may be split
- up into several pieces the information on disc allocation is greatly
- increased. The structure(s) to keep track of free space and disc allocation
- are typically too large to be kept in RAM for hard discs. Therefore the disc
- allocation algorithms tend to be be designed to minimise scanning of these
- structures, rather than minimising fragmentation. This greatly speeds up the
- fragmentation of free space and files, as blocks are claimed and released.
- Even if utilities exist to rationalise disc structure an unintelligent
- allocation scheme quickly fragments things again. Fragmentation degrades
- performance since the parameters of standard disc drives are such that the
- time spent seeking between the fragments of a file will dominate the time
- spent transferring data unless the fragments are greater than a track length
- on average. Therefore if the unit of disc allocation is small fragmentation
- soon reduces most fragments to a few units in length giving slow performance,
- but if the allocation unit is big enough to give anything close to possible
- performance the allocation units are so big that a large part of the disc
- space is wasted in rounding up files to allocation units. These points led to
- the following additional aims.
-
- 4. The disc allocation and free space structure(s) should be small enough to
- be kept in RAM even for large hard discs.
-
- 5. The allocation strategies should be intelligent to slow the rate of
- fragmentation and should not produce small fragments.
-
- 6. Long term fragmentation will still occur so the allocation strategy should
- have the option of rationalising disc allocation, which should not produce
- a noticeable delay to the user.
-
-
- STRUCTURE OVERVIEW
- ------------------
- The disc structure is made up of the following parts
-
- 1. (hard discs only) A boot block containing the defect list and other
- information
- 2. A double copied map which allows you to deduce for any cluster (unit of disc
- allocation) whether it is allocated and if so to what position in which file
- 3. A hierarchical tree of directories
- 4. The remaining space is used for (fragments of) files or is unallocated
-
- 1 and 3 are the same as for the old file structure (see OldMap document)
- except that disc addresses of files and directories are replaced by system
- internal numbers (SIN). The SIN includes a file number which can be looked up
- in the map to find the allocated disc space.
-
-
- NEW MAP STRUCTURE
- -----------------
- Most of the map is bitstream, with each bit in the map being associated with
- the allocation of a fixed number of bytes of disc space. This size (called
- the bit size) is always a power of two. The cluster size is the maximum of
- the bit size and the sector size. The map is divided into zones by the natural
- sector boundaries. Floppies are limited to one zone in the present
- implementation. The structure of a zone is as follows
-
- start byte
- byte length use
-
- 0 1 this is a checksum on this zone calculated as below
- 1 2 offset in bits to first free space in zone, or 0 if none, with
- top bit always set
- 3 1 this byte when EORed with values for other zones should give FF
- 4 60 (only zone 0) the first 32 bytes are a disc record describing
- the disc, the other 28 bytes are 0, being reserved.
-
- The rest of the zone is the bitstream. All entries in the bitstream have the
- same format. The start of an entry is an id field (of width given in the disc
- record), the end is marked by a 1, and any other bits are set to 0. The
- bitstream contains the following three types of object with their respective
- interpretation of the id field.
-
- OBJECT TYPE ID VALUE
-
- a) (fragments of) files file number (minimum value 2)
- b) unallocated fragments bit offset of next free fragment in zone or 0
- c) disc defects always 1
-
- When searching for the fragments of a file zone containing the first fragment
- can be calculated from
-
- zone = file number DIV ids per zone
-
- Where ids per zone = map bits in zone DIV (width of id field + 1)
-
- An exception is the file number 2. This is reserved for the structure which is
- placed on an empty disc (boot block, map and root directory). Searching for
- file number 2 should start at zone = total zones DIV 2. This is to allow the
- root directory and map of hard disc to be placed near the middle.
-
- The rest of the file's fragments (if any) can be found by searching through
- the zones in ascending order, wrapping back to zone 0 if necessary, looking
- for fragments with the correct file number. However because of the allocation
- strategies given below it is rare for a file to be in more than one piece
- unless it has been extended or is very large.
-
- The need for an id field places a minimum size on a fragment. For
- example possible fragment sizes are 2K, 3K, 4K, 5K ... for an 800K
- floppy, or 12K, 13K, 14K, 15K ... for a 20 Mb hard disc with a ten
- zone 2.5K map. Thus a compact map representation (down to 1 bit per
- cluster if necessary) has been achieved by making it impossible to
- represent inefficient allocations, (ie those with small fragments).
-
- In order to avoid wasting disc space for small files, when a file or directory
- does not use the whole of its allocated fragment and the fragment is too small
- too split sharing is allowed. Directories can share fragments with their
- sub files and files in the same directory can share a fragment. Sharing is at
- sector rather than cluster resolution, saving disc space. This also improves
- performance, firstly because head movement is reduced since small files are
- closer to each other and their parent directory on average, and secondly the
- map does not always need to be modified for small files.
-
- The SIN of a file is a 24 bit value. Bits 0-7 are the sector offset+1 in the
- fragment if the file may share a fragment or 0 if not. Bits 8-23 are the file
- number.
-
-
- ALLOCATION STRATEGIES
- ---------------------
- Directories are allocated from the middle zone outwards and files
- produced by whole file operations are allocated from the zone
- containing their parent outwards. For hard discs as well as keeping
- files in the same directory close to each other and their parent
- directory, it also tends to keep small files near the middle of the
- disc and large files further out. Both these reduce head movement.
- When choosing disc space on openout it is put at the start of a zone
- which balances distance from parent directory with amount of free
- space in the zone.
-
- As all allocation is indirected through the map it is possible to
- re-allocate to reduce fragmentation without having to read or write
- directories and a zone compaction routine is available for use by the
- allocation routines. This is highly optimised both in the choice of moves (eg
- it can spot fragments of the same file that can be joined) and in the
- execution of the moves. It builds up lists of moves which it cancels,
- combines, joins, splits, collects together in groups to be done together, and
- sorts into an order that reduces head movement for the scatter read/write
- primitives of the device drivers. Small compactions happen in response to
- particular allocation needs and opportunities rather than compacting a whole
- zone or disc at once so it not usually apparent to the user.
-
- Files produced by whole file operations (eg SAVE) tend to be longer lived
- than those produced by partial file operations (eg OPENOUT, GBPB). So if
- they cannot be allocated a single extent compaction continues until a free
- space of the correct size or data totalling twice the length of the file has
- been moved. If compaction fails the file is allocated either a pair of
- extents which totals the correct size or a set of adjacent free spaces
- (possibly with an extent removed or added to give a better fit). The only
- partial file operations that do compaction are open and close, and then only
- if a very good opportunity is found. As files produced by whole file ops are
- almost always allocated a single extent, and long lived files produced by
- partial file ops tend to re-assemble their fragments as compactions happen,
- seeking between extents of a file is greatly reduced.
-
-
- MINOR DETAILS -------------
-
- The checksum on a zone is calculated/checked as below
-
- ;entry
- ; R0 -> start
- ; R1 zone length
-
- ;exit
- ; LR check byte, Z=0 <=> good
-
- NewCheck ROUT
- Push "R1,R2,LR"
- MOV LR, #0
- ADDS R1, R1, R0 ;C=0
- loop
- LDR R2, [R1, #-4] !
- ADCS LR, LR, R2
- TEQS R1, R0 ;preserves C
- BNE loop
- AND R2, R2, #&FF ;ignore old sum
- SUB LR, LR, R2
- EOR LR, LR, LR, LSR #16
- EOR LR, LR, LR, LSR #8
- AND LR, LR, #&FF
- CMPS R2, LR
- Pull "R1,R2,PC"
-
-
-
-
- The checksum on directories remains the same. But the validation string for a
- directory has changed from 'Hugo' to 'Nick'
-
-
- ; Directory Start
- ^ 0
- StartMasSeq # 1
- StartName # 4
- DirFirstEntry # 0
-
- ; Old Directory End
- ^ 0
- # -1
- DirCheckByte # 0 ;RETRO DEFINITION was reserved
-
- # -4
- EndName # 0
-
- # -1
- EndMasSeq # 0
-
- # -14 ;reserved
-
- DirTitleSz * 19
- # -DirTitleSz
- OldDirTitle # 0
-
- # -3
- OldDirParent # 0
-
- # -NameLen
- OldDirName # 0
-
- # -1
- OldDirLastMark # 0 ;dummy last entry marker
-
- ; New Directory End
- ^ 0
- # -1
- ASSERT DirCheckByte=@
-
- # -4
- ASSERT EndName=@
-
- # -1
- ASSERT EndMasSeq=@
-
- # -NameLen
- NewDirName # 0
-
- # -DirTitleSz
- NewDirTitle # 0
-
- # -3
- NewDirParent # 0
-
- # -1 ;reserved
- # -1 ;reserved
-
- # -1
- NewDirLastMark # 0 ;dummy last entry marker
-
-
-
- ; ================
- ; TestDirCheckByte
- ; ================
-
- ; entry
- ; R3 ind disc add of dir
- ; R5 -> dir start
- ; R6 -> dir end
-
- ; exit
- ; LR check byte
- ; Z clear if matches existing byte
-
- TestDirCheckByte
- Push "R0-R2,R5,R7,LR"
- BL EndDirEntries ;(R3,R5,R6->R0)
- BL TestDir ;(R3->LR,Z)
- ADDEQ R7, R6, #NewDirLastMark+1 ;dir tail
- ADDNE R7, R6, #OldDirLastMark+1
- 05
- BIC R1, R0, #3
- MOV R2, #0
- 10 ;whole words before end of entries
- LDR LR, [R5],#4
- EOR R2, LR, R2,ROR #13
- TEQS R5, R1
- BNE %BT10
- 20 ;odd bytes before end of entries
- LDRNEB LR, [R5], #1 ;not first pass through loop
- EORNE R2, LR, R2,ROR #13
- TEQS R5, R0
- BNE %BT20
-
- MOV R5, R7
- 30 ;odd bytes before at start of tail
- TSTS R5, #3
- LDRNEB LR, [R5],#1
- EORNE R2, LR, R2, ROR #13
- BNE %BT30
-
- ASSERT DirCheckByte=-1 ;dont include last word as it contains
- SUB R1, R6, #4 ;the check byte
- 40 ;whole words in tail
- LDR LR, [R5],#4
- EOR R2, LR, R2,ROR #13
- TEQS R5, R1
- BNE %BT40
-
- EOR R2, R2, R2, LSR #16 ;reduce to 8 bits
- EOR R2, R2, R2, LSR #8
- AND R2, R2, #&FF
-
- LDRB LR, [R6,#DirCheckByte] ;compare with old check byte
- TEQS R2, LR
- MOV LR, R2
-
- Pull "R0-R2,R5,R7,PC"
-
-
- ; =============
- ; EndDirEntries
- ; =============
-
- ; Find the end of the list of entries in a dir
-
- ; entry
- ; R3 ind disc add of dir
- ; R5 -> dir start
- ; R6 -> dir end
-
- ; exit
- ; R0 -> first empty entry
-
- EndDirEntries ROUT
- Push "R2,LR"
- BL TestDir ;(R3->LR,Z)
- ADDEQ R2, R6, #NewDirLastMark
- ADDNE R2, R6, #OldDirLastMark
- ADD R0, R5, #DirFirstEntry
- SUB R0, R0, #DirEntrySz
- 10 ;loop examining entries
- LDRB LR, [R0,#DirEntrySz] !
- CMPS LR, #0 ;until null entry
- CMPNES R0, R2 ;or exhausted
- BLO %BT10
- MOVHI R0, R2
- Pull "R2,PC",,^
-
-
-